Parsing and Subcategorization Data
Abstract
In this paper, we compare the performance of a state-of-the-art statistical parser (Bikel, 2004) in parsing written and spoken language and in generating subcategorization cues from written and spoken language. Although Bikel’s parser achieves a higher accuracy for parsing written language, it achieves a higher accuracy when extracting subcategorization cues from spoken language. Our experiments also show that current technology for extracting subcategorization frames initially designed for written texts works equally well for spoken language. Additionally, we explore the utility of punctuation in helping parsing and extraction of subcategorization cues. Our experiments show that punctuation is of little help in parsing spoken language and extracting subcategorization cues from spoken language. This indicates that there is no need to add punctuation in transcribing spoken corpora simply in order to help parsers.
Related resources
Can Subcategorization Help a Statistical Dependency Parser?
Today there is a relatively large body of work on the automatic acquisition of lexico-syntactic preferences (subcategorization) from corpora. Various techniques have been developed that not only produce machine-readable subcategorization dictionaries but can also weight the various subcategorization frames probabilistically. Clearly there should be a potential to use such weighted...
Acquiring German Prepositional Subcategorization Frames from Corpora
This paper presents a procedure to automatically learn German prepositional subcategorization frames from text corpora. It is based on shallow parsing techniques employed to identify high-accuracy cues for prepositional frames, the EM algorithm to solve the PP attachment problem implicit in the task, and a method to rank the evidence for subcategorization provided by the collected data.
Semantic Parsing based on Verbal Subcategorization
The aim of this work is to explore new methodologies on Semantic Parsing for unrestricted texts. Our approach follows the current trends in Information Extraction (IE) and is based on the application of a verbal subcategorization lexicon (LEXPIR) by means of complex pattern recognition techniques. LEXPIR is framed on the theoretical model of the verbal subcategorization developed in the Pirapid...
Research on Rule-based Chinese Syntactic Parsing Postprocess Using Verb Subcategorization
In this paper we propose a simple approach to postprocessing Chinese syntactic parses. It matches verb subcategorization syntactic modes against the n-best candidate parse trees output by a baseline parser system. We extract various verb subcategorization features from the training corpora and use those features to rerank the n-best list via a simi...
A Corpus-based Conceptual Clustering Method for Verb Frames and Ontology Acquisition
We describe in this paper the ML system, ASIUM, which learns subcategorization frames of verbs and ontologies from syntactic parsing of technical texts in natural language. The restrictions of selection in the subcategorization frames are filled by the concepts of the ontology. Applications requiring subcategorization frames and ontologies are crucial and numerous. The most direct applications ...
Publication date: 2006